Foreword

This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).

The present document specifies the codec specific RTP protocol details applying to packet switched conversational 
multimedia applications within the 3GPP IM Subsystem. 

The contents of the present document are subject to continuing work within the TSG and may change following formal 
TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an 
identifying change of release date and an increase in version number as follows: 

Version x.y.z 

where: 

x the first digit:

1 presented to TSG for information; 

2 presented to TSG for approval; 

3 or greater indicates TSG approved document under change control. 

y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, 
updates, etc. 

z the third digit is incremented when editorial only changes have been incorporated in the document. 



Introduction 



The present document contains a specification for required protocol usage within 3GPP specified Conversational Packet
Switched Multimedia Services [5], which are based on the IP Multimedia Subsystem (IM Subsystem). The IM Subsystem
specifically includes the conversational IP multimedia services, whose service architecture, call control and
media capability control procedures have been defined in 3GPP TS 24.229 [7], and are based on the 3GPP adopted
version of the IETF Session Initiation Protocol (SIP) [1].

The conversational packet switched multimedia service depends on the IM Subsystem. The individual media types are
independently encoded and packetized into separate Real-time Transport Protocol (RTP) packets. These packets are
then transported end-to-end inside UDP datagrams over real-time IP connections that have been negotiated and opened
between the terminals during the SIP call as specified in 3GPP TS 24.229 [7].

The UEs operating within the IM Subsystem need to provide encoding/decoding of the derived codecs, and perform the
corresponding packetization/depacketization functions. The logical binding between the media streams is handled in the SIP
session layer, and inter-media synchronization in the receiver is handled with the use of RTP time stamps.
1 Scope



The present document introduces the required protocols for packet switched conversational multimedia applications 
within 3GPP IP Multimedia Subsystem. Visual and sound communications are specifically addressed. The intended 
applications are assumed to require low-delay, real-time functionality. 

The present document describes the required protocol related elements for a 3G PS multimedia terminal:

• required SDP signalling regarding the media type bit rate, packet size, packet transport frequency; 

• usage of RTP payload for media types; 

• bandwidth adaptation; 

• QoS negotiation. 

The present document is applicable, but not limited, to packet switched video telephony. 
The applicability of the present document to GERAN is FFS. 



2 References



The following documents contain provisions which, through reference in this text, constitute provisions of the present 
document. 

• References are either specific (identified by date of publication, edition number, version number, etc.) or 
non-specific. 

• For a specific reference, subsequent revisions do not apply. 

• For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including 
a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same 
Release as the present document. 

[1] IETF RFC 2543: "SIP: Session Initiation Protocol".

[2] IETF RFC 2327: "SDP: Session Description Protocol".

[3] IETF RFC 1889: "RTP: A Transport Protocol for Real-Time Applications".

[4] IETF RFC 1890: "RTP Profile for Audio and Video Conferences with Minimal Control".

[5] 3GPP TS 26.235: "Packet switched conversational multimedia applications; Default codecs".

[6] 3GPP TS 24.228: "Signalling flows for the IP multimedia call control based on SIP and SDP; stage 3".

[7] 3GPP TS 24.229: "IP multimedia call control protocol based on SIP and SDP".

[8] 3GPP TS 23.228: "IP Multimedia Subsystem (IMS); Stage 2".

[9] 3GPP TS 23.107: "Quality of Service (QoS) concept and architecture".

[10] 3GPP TS 23.207: "End to end quality of service concept and architecture".

[11] 3GPP TS 23.060: "General Packet Radio Service (GPRS); Service description; Stage 2".

[12] 3GPP TS 26.071: "Mandatory Speech Codec speech processing functions; AMR Speech Codec; General description".

[13] 3GPP TS 26.090: "AMR speech Codec; Transcoding Functions".

[14] 3GPP TS 26.073: "AMR speech Codec; C-source code".

[15] 3GPP TS 26.104: "ANSI-C code for the floating-point Adaptive Multi-Rate AMR speech codec".

[16] 3GPP TS 26.171 (Release 5): "AMR speech codec, wideband; General description".

[17] 3GPP TS 26.190 (Release 5): "Mandatory Speech Codec speech processing functions AMR Wideband speech codec; Transcoding functions".

[18] 3GPP TS 26.201 (Release 5): "AMR speech codec, wideband; Frame structure".

[19] 3GPP TS 26.235: "Packet switched conversational multimedia applications; Default codecs"; Annex B: "RTP payload format and storage format for AMR and AMR-WB audio".

[20] ITU-T Recommendation H.263: "Video coding for low bit rate communication".

[21] IETF RFC 2429: "RTP Payload Format for the 1998 Version of ITU-T Rec. H.263 Video (H.263+)".

[22] ISO/IEC 14496-2 (1999): "Information technology - Coding of audio-visual objects - Part 2: Visual".

[23] IETF RFC 3016: "RTP Payload Format for MPEG-4 Audio/Visual Streams".

[24] ITU-T Recommendation H.263 (annex X): "Annex X: Profiles and levels definition".

[25] 3GPP TS 26.235: "Packet Switched Conversational Multimedia Applications; Default Codecs"; Annex C: "ITU-T H.263 MIME media type registration".

[26] ITU-T Recommendation T.140 (1998): "Protocol for multimedia application text conversation" (with amendment 2000).

[27] IETF RFC 2793: "RTP Payload for Text Conversation".

[28] IETF RFC 3578: "SDP bandwidth modifier for RTCP bandwidth".



3 Definitions and abbreviations 

3.1 Definitions 

For the purposes of the present document, the following term and definition applies: 

3G PS multimedia terminal: terminal based on IETF SIP/SDP internet standards modified by 3GPP for purposes of 
3GPP IM Subsystem services 

3.2 Abbreviations 

For the purposes of the present document, the following abbreviations apply: 

AMR Adaptive MultiRate codec 

IETF Internet Engineering Task Force 

IM Subsystem Internet protocol Multimedia Subsystem 

ITU-T International Telecommunications Union-Telecommunications 

RFC IETF Request For Comments 

RTCP RTP Control Protocol

RTP Real-time Transport Protocol 

SDP Session Description Protocol 

SIP Session Initiation Protocol 
4 General



3G PS multimedia terminals provide real-time video, audio, or data, in any combination, including none, over 3GPP IM 
Subsystem. Terminals are based on IETF defined multimedia protocols SIP, SDP, RTP and RTCP. Communication 
may be either 1-way or 2-way. Such terminals may be part of a portable device or integrated into an automobile or other 
non-fixed location device. They may also be fixed, stand-alone devices; for example, a video telephone or kiosk. 
Multimedia terminals may also be integrated into PCs and workstations. 

In addition, interoperation with other types of multimedia telephone terminals, such as 3G-324M, may be possible.
In such a case, however, a media gateway functionality supporting 3G-324M - IM Subsystem interworking will be
required within or outside the IM Subsystem.

Figure 1 presents the user plane protocol stack of a 3G PS conversational multimedia terminal explaining the transport 
of different media types and QoS reports. 



[Figure 1: protocol stack, top to bottom: Conversational Multimedia Application; Audio, Video and Text; Payload formats; RTP; UDP; IP]

Figure 1 - User plane protocol stack for 3G PS conversational multimedia terminal



5 Media type requirements



Media type RTP payload usage is specified in this clause. The media types and corresponding codecs are specified in 
3GPP TS 26.235 [5]. The continuous media type RTP payloads are mapped to RTP packets according to IETF RTP 
Profile for Audio and Video Conferences with Minimal Control in RFC 1890 [4]. 



5.1 Audio



5.1.1 RTP session description parameters

The IETF AMR and AMR-WB RTP payload format [19] offers different options. The following list gives the options and how
they should be used by the transmitter. The receiver shall at least support the options as they are listed:

• the bandwidth efficient operation shall be used;

• only one speech frame shall be encapsulated in each RTP packet;

• the multi-channel session shall not be used;

• interleaving shall not be used;

• internal CRC shall not be used.
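
The option list above can be expressed as a simple configuration check. The following is a minimal sketch only: the option names are hypothetical labels chosen for illustration and are not parameter names taken from the payload format [19].

```python
# Hypothetical option names; the required values mirror the list above.
REQUIRED_AMR_OPTIONS = {
    "bandwidth_efficient": True,  # bandwidth efficient operation shall be used
    "frames_per_packet": 1,       # one speech frame per RTP packet
    "multi_channel": False,       # multi-channel session shall not be used
    "interleaving": False,        # interleaving shall not be used
    "crc": False,                 # internal CRC shall not be used
}


def amr_option_violations(options):
    """Return the names of options that differ from the required settings."""
    return sorted(
        name
        for name, required in REQUIRED_AMR_OPTIONS.items()
        if options.get(name) != required
    )


compliant = dict(REQUIRED_AMR_OPTIONS)
print(amr_option_violations(compliant))
print(amr_option_violations({**compliant, "interleaving": True}))
```

A transmitter configuration that returns an empty list satisfies all five requirements; any other result names the options to correct.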
5.2 Video

Video packets should be kept small to allow better error resilience and to minimize the transmission delay in
conversational service. The size of each packet shall be kept smaller than 512 bytes.

5.3 Real time text 

The RTP payload format for the real time text media type, ITU-T Recommendation T.140, is specified in [27]. Redundant
transmission provided by the RTP payload format is recommended in error prone channels.



6 Call control 

Functional requirements for call control are specified in 3GPP TS 23.228 [8].

The required signalling functions are specified in 3GPP TS 24.228 [6] and call control protocols in
3GPP TS 24.229 [7].

QoS authorization issues and interworking with the IM subsystem in general are covered in 3GPP TS 23.207 [10].



7 Bearer control 

The media control is based on declaration of terminal media capability sets in the SDP part of appropriate SIP messages.
The usage of bearer bandwidth can be effectively controlled by adjusting the media type encoder bit rates.

7.1 Bandwidth 

The bandwidth information of each media type shall be carried in SDP messages at both session and media type level
during codec negotiation, session establishment and resource reallocation. Note that for RTP based applications,
'b=AS:' gives the RTP "session bandwidth" (including UDP/IP overhead) as defined in section 6.2 of [3].

The bandwidth for RTCP traffic shall be described using the "RS" and "RR" SDP bandwidth modifiers at media level,
as specified by [28]. Therefore, a conversational multimedia terminal shall include the "b=RS:" and "b=RR:" fields in
SDP, and shall be able to interpret them. There shall be a limit on the allowed RTCP bandwidth for a session signalled
by the terminal. This limit is defined as follows:

• 4000 bps for the RS field (at media level); 

• 3000 bps for the RR field (at media level). 
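
As an illustration of the bandwidth fields described above, the sketch below builds the media-level "b=" lines and rejects RTCP values above the stated limits. It assumes that the AS value is written in kbps and the RS and RR values in bps, matching the units used in this clause; the function name and the example values are hypothetical.

```python
MAX_RS_BPS = 4000  # limit for the RS field at media level
MAX_RR_BPS = 3000  # limit for the RR field at media level


def media_bandwidth_lines(as_kbps, rs_bps, rr_bps):
    """Build media-level b= lines (assumed units: AS in kbps, RS/RR in bps)."""
    if not 0 <= rs_bps <= MAX_RS_BPS:
        raise ValueError(f"RS must be within 0..{MAX_RS_BPS} bps")
    if not 0 <= rr_bps <= MAX_RR_BPS:
        raise ValueError(f"RR must be within 0..{MAX_RR_BPS} bps")
    return [f"b=AS:{as_kbps}", f"b=RS:{rs_bps}", f"b=RR:{rr_bps}"]


# Example values are illustrative only.
for line in media_bandwidth_lines(as_kbps=29, rs_bps=600, rr_bps=2000):
    print(line)
```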



7.2 QoS negotiation 



The QoS architecture and concept is specified in 3GPP TS 23.107 [9]. The end-to-end QoS framework involving GPRS
and UMTS is specified in 3GPP TS 23.207 [10]. The applicable general QoS mechanism and service description for the
GPRS in GSM and UMTS is specified in 3GPP TS 23.060 [11].

7.3 RTP receiver 

The RTP receiver implementation and functionality, including lost and delayed packet processing as well as the jitter buffer,
is out of scope of the present document.
Annex A (informative):
Optional enhancements 

This annex is intended for informational purposes only. This is not an integral part of the present document. 



A.1 Video enhancements 

This clause gives informative recommendations for the video media type control. 

The SDP attributes regarding the video frame rate and the quality of media encoding should be used to ensure good
video service. The recommended usage of these attributes is FFS.

a=framerate:<frame rate> describes the maximum video frame rate attribute in frames/second. Fractional
values of <frame rate> are allowed.

a=quality:<quality> describes the quality of media encoding attribute, where the <quality> is a
value in [0..10] with 10 indicating the best quality.
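
The two attributes could be generated as follows. This is a minimal sketch: the helper name and the example values are hypothetical, and the range checks only restate the constraints given above.

```python
def video_attribute_lines(frame_rate, quality):
    """Format the a=framerate and a=quality SDP attribute lines."""
    if frame_rate <= 0:
        raise ValueError("frame rate must be positive")
    if not 0 <= quality <= 10:
        raise ValueError("quality must be within 0..10")
    # The 'g' format keeps fractional frame rates and drops a trailing ".0".
    return [f"a=framerate:{frame_rate:g}", f"a=quality:{quality}"]


for line in video_attribute_lines(7.5, 5):
    print(line)
```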
Annex B (informative):

Mapping of SDP parameters to UMTS QoS parameters

This clause gives recommendations for mapping SDP parameters to UMTS QoS parameters for conversational
multimedia applications. Different use cases are considered. Each use case generates an example table of QoS profile
parameters. The values indicated are derived from the applications' QoS requirements, and may not be fulfilled by the
network. In the parameters for guaranteed and maximum bit rates a granularity of 1 kbps is assumed for bearers up to
64 kbps, as defined in 3GPP TS 24.008. Therefore the "Ceiling" function is used for rounding up fractional values,
wherever needed. In addition, the same specification defines a granularity of 10 bytes for the Maximum SDU size
values. This is taken into account in the computation of this field in the QoS profile.

Use case 1 - Voice over IP 

This use case includes the scenario in which two conversational multimedia terminals establish a bi-directional Voice 
over IP (VoIP) connection for speech communication, using the AMR or AMR-WB codecs with the same bit rate in 
both uplink and downlink directions. 

For example an AMR VoIP stream encoded at 12.2 kbps, with one speech frame encapsulated into an RTP packet, 
would yield IP packets of the following size (using the mandated bandwidth efficient mode): 

20 (IPv4) + 8 (UDP) + 12 (RTP) + 32 (AMR RTP payload) = 72 bytes, or 

40 (IPv6 with no extension headers) + 8 (UDP) + 12 (RTP) + 32 (AMR RTP payload) = 92 bytes. 



The gross bit rate including uncompressed RTP/UDP/IPv4 headers would be 28.8 kbps. The value in the b=AS media 
level parameter would be 29. 
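
The arithmetic behind the 28.8 kbps figure can be reproduced as follows. This is a sketch under one assumption not stated in the text above: one AMR speech frame covers 20 ms, so 50 RTP packets are sent per second.

```python
import math

AMR_PACKETS_PER_SECOND = 50            # assumed: one 20 ms speech frame per packet
IPV4_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def gross_bit_rate_kbps(payload_bytes, header_bytes, packets_per_second):
    """Bit rate on the wire, including uncompressed RTP/UDP/IP headers."""
    return (payload_bytes + header_bytes) * 8 * packets_per_second / 1000


rate = gross_bit_rate_kbps(32, IPV4_UDP_RTP_HEADER_BYTES, AMR_PACKETS_PER_SECOND)
print(rate)             # 28.8
print(math.ceil(rate))  # 29, the value carried in b=AS
```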

To determine the Maximum SDU size parameter we should consider the maximum packet size that can be generated
with a speech codec. This is the packet generated by an AMR-WB stream at 23.85 kbps packetized in bandwidth
efficient mode and with 1 speech frame per packet. Considering uncompressed RTP/UDP/IPv6 headers, the maximum
packet size is 121 bytes.
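
The 121-byte figure and the resulting Maximum SDU size can be checked with the sketch below. Two assumptions are not taken from the text above: a bandwidth efficient payload with one frame is taken to carry a 4-bit CMR field and a 6-bit table-of-contents entry ahead of the speech bits, and the AMR-WB 23.85 kbps frame is taken to hold 477 speech bits.

```python
import math

IPV6_UDP_RTP_HEADER_BYTES = 40 + 8 + 12


def amr_be_payload_bytes(speech_bits):
    """Payload size for one frame in bandwidth efficient mode (assumed layout:
    4-bit CMR + 6-bit ToC entry + speech bits, padded to whole octets)."""
    return math.ceil((4 + 6 + speech_bits) / 8)


def max_sdu_bytes(packet_bytes, granularity=10):
    """Round the packet size up to the 10-byte granularity of Maximum SDU size."""
    return math.ceil(packet_bytes / granularity) * granularity


payload = amr_be_payload_bytes(477)           # AMR-WB 23.85 kbps (assumed 477 bits)
packet = IPV6_UDP_RTP_HEADER_BYTES + payload
print(payload, packet, max_sdu_bytes(packet))  # 61 121 130
```

The same helper gives 32 bytes for the 244 speech bits of AMR 12.2 kbps, which agrees with the payload size used in the packet size example earlier in this use case.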



The QoS profile would be set then using the following parameters: 
Table B.1: QoS profile for AMR VoIP at 12.2 kbps

| QoS parameter | Parameter value | Comment |
|---|---|---|
| Delivery of erroneous SDUs | No | |
| Delivery order | No | To minimize delay in the access stratum. The application should take care of eventual packet reordering |
| Traffic class | Conversational | |
| Maximum SDU size | 130 bytes | 10 bytes granularity. The RTCP packet size might change the maximum SDU size limitation [tbc] |
| Guaranteed bitrate for downlink | SDP media bw in DL + 2.5% * (SDP media bw in DL + SDP media bw in UL) = Ceil(30.45) = 31 kbps | |
| Maximum bit rate for downlink | Ceil(30.45) = 31 kbps | |
| Guaranteed bitrate for uplink | SDP media bw in UL + 2.5% * (SDP media bw in UL + SDP media bw in DL) = Ceil(30.45) = 31 kbps | |
| Maximum bit rate for uplink | Ceil(30.45) = 31 kbps | |
| Residual BER | 10^-5 | 16 bit CRC |
| SDU error ratio | 7*10^-3 | |
| Traffic handling priority | Not used in Conversational traffic class | |
| Transfer delay | 100 ms | |
| SDU format information | Not used | |
| Allocation/retention priority | Subscribed allocation/retention priority | Not relevant for the application |
| Source statistics descriptor | "Speech" | |
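
The bit rate rows of Table B.1 follow one formula: the SDP media bandwidth of the direction concerned plus 2.5% of the sum of both directions, rounded up to 1 kbps. A minimal sketch is given below; the function name is hypothetical, and exact fractions are used so that rounding up is not disturbed by floating point error.

```python
import math
from fractions import Fraction

RTCP_FRACTION = Fraction(25, 1000)  # the 2.5% term of the table formula


def bearer_bit_rate_kbps(bw_this_direction, bw_other_direction):
    """Media bandwidth plus 2.5% of both directions, rounded up to 1 kbps."""
    raw = bw_this_direction + RTCP_FRACTION * (bw_this_direction + bw_other_direction)
    return math.ceil(raw)


print(bearer_bit_rate_kbps(29, 29))  # 31, from Ceil(30.45)
```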





In some cases, multiple AMR or AMR-WB rates are available, and rate control techniques allow switching between
different modes based on the received speech quality. For example, if the available AMR mode set is {4.75, 10.2, 12.2}
kbps, the set of gross bit rates is:



AMR 4.75 kbps: 21.6 kbps (including RTP/UDP/IPv4 headers). [SDP b=AS parameter would be 22]. 
AMR 10.2 kbps: 26.8 kbps (including RTP/UDP/IPv4 headers). [SDP b=AS parameter would be 27]. 
AMR 12.2 kbps: 28.8 kbps (including RTP/UDP/IPv4 headers). [SDP b=AS parameter would be 29]. 



The maximum bit rate is set to the highest mode of the codec. However, the procedure on how to choose the 
guaranteed bit rate when several codec rates are available is to be defined. Here we provide an example QoS profile in 
which the guaranteed speech quality is at least that of 10.2 kbps AMR for both uplink and downlink directions, while 
the non-guaranteed maximum quality is that of 12.2 kbps for both uplink and downlink directions. 
Table B.2: QoS profile for AMR VoIP at 3 bit rates with rate control

| QoS parameter | Parameter value | Comment |
|---|---|---|
| Delivery of erroneous SDUs | No | |
| Delivery order | No | To minimize delay in the access stratum. The application should take care of eventual packet reordering |
| Traffic class | Conversational | |
| Maximum SDU size | 130 bytes | 10 bytes granularity. The RTCP packet size might change the maximum SDU size limitation [tbc] |
| Guaranteed bitrate for downlink | SDP media bw in DL + 2.5% * (SDP media bw in DL + SDP media bw in UL) = Ceil(28.35) = 29 kbps | Guaranteed quality 10.2 kbps (media bw = 27 kbps) |
| Maximum bit rate for downlink | SDP media bw in DL + 2.5% * (SDP media bw in DL + SDP media bw in UL) = Ceil(30.35) = 31 kbps | Non-guaranteed quality 12.2 kbps (media bw = 29 kbps) |
| Guaranteed bitrate for uplink | SDP media bw in UL + 2.5% * (SDP media bw in UL + SDP media bw in DL) = Ceil(28.35) = 29 kbps | Guaranteed quality 10.2 kbps (media bw = 27 kbps) |
| Maximum bit rate for uplink | SDP media bw in UL + 2.5% * (SDP media bw in UL + SDP media bw in DL) = Ceil(30.35) = 31 kbps | Non-guaranteed quality 12.2 kbps (media bw = 29 kbps) |
| Residual BER | 10^-5 | 16 bit CRC |
| SDU error ratio | 7*10^-3 | |
| Traffic handling priority | Not used in Conversational traffic class | |
| Transfer delay | 100 ms | |
| SDU format information | Not used | |
| Allocation/retention priority | Subscribed allocation/retention priority | Not relevant for the application |
| Source statistics descriptor | "Speech" | |





Use case 2 - Unidirectional video 

This use case includes the scenario in which two conversational multimedia terminals establish a uni-directional video 
connection, using the H.263 or MPEG-4 codecs. 

The video codec in this example has a bitrate of 36 kbps, with RTP payload packets of 75 bytes (excluding payload 
header which is, for example, 2 bytes). The sending terminal would produce IP packets of the following size: 



20 (IPv4) + 8 (UDP) + 12 (RTP) + 77 (video RTP payload+payload header) = 117 bytes, or

40 (IPv6 with no extension headers) + 8 (UDP) + 12 (RTP) + 77 (video RTP payload+payload header) = 137 bytes.



The gross bit rate including uncompressed RTP/UDP/IPv4 headers would be 56.2 kbps. The value in the b=AS media 
level parameter would be 57. 
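
The 56.2 kbps figure can be reproduced with the sketch below. It assumes that the packet rate is obtained by dividing the codec bit rate by the payload size without the payload header; the text does not state this derivation, but it matches the quoted result.

```python
import math

IPV4_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def video_gross_kbps(codec_kbps, payload_bytes, payload_header_bytes):
    """Gross video bit rate including uncompressed RTP/UDP/IPv4 headers."""
    # Assumed: packet rate = codec bit rate / payload size (header excluded).
    packets_per_second = codec_kbps * 1000 / (payload_bytes * 8)
    packet_bytes = IPV4_UDP_RTP_HEADER_BYTES + payload_bytes + payload_header_bytes
    return packet_bytes * 8 * packets_per_second / 1000


gross = video_gross_kbps(36, 75, 2)
print(round(gross, 1), math.ceil(gross))  # 56.2 57
```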
The maximum video packet size is limited to 512 bytes in section 5.2. This value is fine if transmission occurs over the
UMTS Iu interface. However, in order to avoid SNDCP fragmentation of packets over the GERAN Gb interface (where
the default size for LLC data field (=SNDCP frame) is 500 bytes) the maximum IP packet size is 500 - 4
(unacknowledged mode SNDCP header) = 496 bytes. Therefore, the maximum size of a video packet is 496 - 60
(RTP/UDP/IPv6 uncompressed headers) = 436 bytes (including RTP payload header). 400 bytes is a safer value.
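
The GERAN limit derived above can be written as a short calculation. The constant and function names are hypothetical; the values are those quoted in the paragraph.

```python
LLC_DATA_FIELD_BYTES = 500           # default LLC data field (SNDCP frame) size
SNDCP_UNACK_HEADER_BYTES = 4         # unacknowledged mode SNDCP header
IPV6_UDP_RTP_HEADER_BYTES = 40 + 8 + 12


def max_video_packet_bytes():
    """Largest video packet (payload plus payload header) avoiding SNDCP fragmentation."""
    max_ip_packet = LLC_DATA_FIELD_BYTES - SNDCP_UNACK_HEADER_BYTES
    return max_ip_packet - IPV6_UDP_RTP_HEADER_BYTES


def fits_without_fragmentation(video_packet_bytes):
    """True if a video packet of this size stays within the GERAN Gb limit."""
    return video_packet_bytes <= max_video_packet_bytes()


print(max_video_packet_bytes())         # 436
print(fits_without_fragmentation(400))  # True
```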



The QoS profile of the receiving terminal would be set then using the following parameters: 

Table B.3: QoS profile for unidirectional video at 36 kbps

| QoS parameter | Parameter value | Comment |
|---|---|---|
| Delivery of erroneous SDUs | No | |
| Delivery order | No | To minimize delay in the access stratum. The application should take care of eventual packet reordering |
| Traffic class | Conversational | |
| Maximum SDU size | 500 bytes | 10 bytes granularity |
| Guaranteed bitrate for downlink | SDP media bw in DL + 2.5% * (SDP media bw in DL) = Ceil(58.43) = 59 kbps | |
| Maximum bit rate for downlink | Equal or higher than guaranteed bit rate | |
| Guaranteed bitrate for uplink | 2.5% * (SDP media bw in DL) = Ceil(1.43) = 2 kbps | For RTCP |
| Maximum bit rate for uplink | Equal or higher than guaranteed bit rate | |
| Residual BER | 10^-5 | 16 bit CRC |
| SDU error ratio | 10^-3 | |
| Traffic handling priority | Not used in Conversational traffic class | |
| Transfer delay | 250 ms | |
| SDU format information | Not used | |
| Allocation/retention priority | Subscribed allocation/retention priority | Not relevant for the application |
| Source statistics descriptor | "Unknown" | |





Use case 3 - Video telephony 

This use case includes the scenario in which two conversational multimedia terminals establish a bi-directional 
speech/video connection, using the AMR/AMR-WB and H.263/MPEG-4 codecs at the same bit rates in uplink and 
downlink directions. 

The video codec in this case has a bitrate of 28 kbps, with RTP payload packets of 250 bytes (excluding payload header 
which is, for example, 2 bytes). The total video bit rate is 32.7 kbps (including RTP/UDP/IPv4 headers). The value in 
the b=AS media level parameter would be 33. In the same bearer there is an AMR stream at 10.2 kbps with 1 frame 
encapsulated per RTP packet using the bandwidth efficient mode. The total voice bit rate is 26.8 kbps (including 
RTP/UDP/IPv4 headers). The value in the b=AS media level parameter would be 27. The total media bit rate is 
28+10.2=38.2 kbps. The total session bit rate is 33+27=60 kbps. 

The terminal would produce IP packets of the following size: 



AMR: 20 (IPv4) + 8 (UDP) + 12 (RTP) + 27 (AMR RTP payload) = 67 bytes (or 87 bytes for IPv6 with no extension 
headers). 
Video: 20 (IPv4) + 8 (UDP) + 12 (RTP) + 252 (video RTP payload+payload header) = 292 bytes (or 312 bytes for IPv6
with no extension headers).
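
The figures quoted for this use case can be cross-checked with the sketch below. It assumes 50 AMR packets per second (20 ms frames) and a video packet rate equal to the codec bit rate divided by the payload size; neither derivation is stated in the text. The last step applies the 2.5% formula used for the guaranteed bit rate in Table B.4.

```python
import math
from fractions import Fraction

IPV4_UDP_RTP_HEADER_BYTES = 20 + 8 + 12
RTCP_FRACTION = Fraction(25, 1000)  # the 2.5% term of the Table B.4 formula


def gross_kbps(payload_bytes, packets_per_second):
    """Bit rate including uncompressed RTP/UDP/IPv4 headers, as an exact fraction."""
    packet_bytes = IPV4_UDP_RTP_HEADER_BYTES + payload_bytes
    return Fraction(packet_bytes) * 8 * packets_per_second / 1000


def with_rtcp(bw_this_direction, bw_other_direction):
    """Media bandwidth plus 2.5% of the bandwidth of both directions."""
    return bw_this_direction + RTCP_FRACTION * (bw_this_direction + bw_other_direction)


video_pps = Fraction(28 * 1000, 250 * 8)          # 14 packets per second (assumed)
video_as = math.ceil(gross_kbps(252, video_pps))  # 32.704 kbps, so b=AS is 33
amr_as = math.ceil(gross_kbps(27, 50))            # 26.8 kbps, so b=AS is 27

guaranteed = with_rtcp(amr_as, amr_as) + with_rtcp(video_as, video_as)
print(video_as, amr_as, video_as + amr_as, guaranteed)  # 33 27 60 63
```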



The considerations made in use case 2 about the maximum packet sizes also apply to this use case.
The QoS profile of the videotelephony terminal would then be set using the following parameters:

Table B.4: QoS profile for videotelephony at 38.2 kbps

| QoS parameter | Parameter value | Comment |
|---|---|---|
| Delivery of erroneous SDUs | No | |
| Delivery order | No | To minimize delay in the access stratum. The application should take care of eventual packet reordering |
| Traffic class | Conversational | |
| Maximum SDU size | 500 bytes | 10 bytes granularity |
| Guaranteed bitrate for downlink | SDP media bw in DL for AMR + 2.5% * (SDP media bw in DL for AMR + SDP media bw in UL for AMR) + SDP media bw in DL for video + 2.5% * (SDP media bw in DL for video + SDP media bw in UL for video) = 63 kbps | |
| Maximum bit rate for downlink | Equal or higher than guaranteed bit rate | |
| Guaranteed bitrate for uplink | SDP media bw in UL for AMR + 2.5% * (SDP media bw in UL for AMR + SDP media bw in DL for AMR) + SDP media bw in UL for video + 2.5% * (SDP media bw in UL for video + SDP media bw in DL for video) = 63 kbps | |
| Maximum bit rate for uplink | Equal or higher than guaranteed bit rate | |
| Residual BER | 10^-5 | 16 bit CRC |
| SDU error ratio | 10^-3 | |
| Traffic handling priority | Not used in Conversational traffic class | |
| Transfer delay | 100 ms | |
| SDU format information | Not used | |
| Allocation/retention priority | Subscribed allocation/retention priority | Not relevant for the application |
| Source statistics descriptor | "Unknown" | |





In case of usage of separate PDP contexts for the speech and video streams, the speech stream QoS profile parameters
are set similarly to use case 1, while the video stream QoS profile parameters are set similarly to use case 2 (but
considering that the video flow is bi-directional and considering possibly the same UMTS bearer transfer delay
constraints for both media).